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^ i One of the major goals of cosmological observations is to test theories of 



structure formation. The most straightforward way to carry out such tests is 



^ ■ to compute the hkehhood function C, the probabihty of getting the data given 

the theory. We write down this function for a general galaxy survey. The full 
Ph| likelihood function is very complex, depending on all of the n-point functions 

of the theory under consideration. Even in the simplest case, where only 
^ ■ the two point function is non- vanishing (Gaussian perturbations), C cannot 

be calculated exactly, primarily because of the Poisson nature of the galaxy 
I distribution. Here we expand C about the (trivial) zero correlation limit. As 

^ . a first application, we take the binned values of the two point function as free 

parameters and show that C peaks at {DD — DR + RR)/DD. Using Monte 
Carlo techniques, we compare this estimator with the traditional DD/DR 
and Landy & Szalay estimators. More generally, the success of this expansion 
should pave the way for further applications of the likelihood function. 
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I. INTRODUCTION 



Recently, there has been a vast increase in both the quahty and quantity of data accu- 
mulated by observational cosmologists. The high quality of this observational work sets a 
standard for theorists to interpret the data as carefully and meaningfully as possible. For 
these purposes, the hkelihood function has become a powerful tool for all cosmologists. 
The hkelihood function, £, is the probability of a given data set given a theory; as such 
it is a natural way of connecting observations with theory. Examples of its application to 
cosmological data sets include cosmic microwave background anisotropics, gravitational 
lens studies and fitting absorption lines in QSO spectra. 

Until now, though, there has been no attempt to apply likelihood techniques to galaxy 
surveys. Instead, one typically determines the two-point function, ^ (r), or the power spec- 
trum, P{k), from the survey using appropriately chosen estimators (usually depending 
on the distribution of observed pairs of points). On the one hand, this is quite surprising: 
it would be wonderful to use all the information in a survey instead of just one small 
subset of it. If the galaxy distribution was Gaussian, so that only the two-point func- 
tion mattered, it would make sense to neglect all the other information in the catalogue. 
The galaxy distribution, however, is decidedly non-Gaussian, so the likelihood function, 
which incorporates information from all the n-point functions, seems more appropriate. 
On the other hand, the likelihood function for galaxies in a survey is a very complicated 
beast. Even if the underlying density field were Gaussian (so that only the two point 
function was non-zero), we would not necessarily expect an ad hoc estimator to be ideal. 
Further, and perhaps most importantly, the galaxies themselves are a Poisson (or other) 
sample of this underlying field, which introduces non-Gaussianity into the distribution 
of galaxy locations. In §11, we write down C and show that even for the Gaussian case it 
is computationally unfeasible to calculate. 

In §111, we propose one way to extract some information from the likelihood function: 
we expand C about its value when all correlations are zero. This weak correlation limit 
proves very fruitful. As an example, we consider a general 2-D (angular) survey (although 
the validity of our results is not confined to 2-D) and a theory with the free parameters 
being the value of the two-point correlation function wo in different angular bins. Thus, 
in this example, £ is a function of the data and the parameters wg. We show that, for 
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each 6, C peaks at 




DD{9) - DR{9) + RRje) 
DD{e) 



(1) 



where DD, DR, and RR are respectively the number of pairs of particles in the data set 
with angular distances within the bin denoted by 6; the number of pairs of particles — one 
in the data set and one in a random catalogue — in the 6 bin; and the number of random 
pairs in the bin 6. This estimator is very close to the one introduced by Landy & Szalay 
(1993). (For other estimators, see Peebles (1980), Hewett (1982), and Hamilton (1993); 
for a recent discussion of the Landy & Szalay estimator and a generalization to higher 
order correlations, see Szapudi & Szalay (1997)). We regard this as a success of the 
expansion: we are able to extract a reasonable estimator, one shown to be significantly 
more effective than the traditional DD/DR. In §IV, we analyze these estimators more 
carefully using Monte Carlo simulations to see which has the lowest variance. These 
simulations suggest that the maximum likelihood estimator of Eq. |T] has a smaller variance 
than both the traditional estimators and the Landy & Szalay estimator. In the course of 
performing these simulations, we have uncovered additional terms in the variance of the 
Landy- Szalay estimator which dominates over the normal Poisson term when there are 
many galaxies in the survey. We present a simple derivation of these additional terms in 
Appendix B. 

While the application developed in §111 and §IV is useful, we feel the most important 
feature of our analysis is the realization that the likelihood function can be approximated 
in a meaningful way. This opens the door to a host of applications, some of which we 
speculate about in the conclusion. Finally, to improve the readability of the text, we 
have shifted most of the calculational details of the expansion in §111 to Appendix A. 



The probability of finding galaxies at positions xi,X2, ■ ■ ■ ,xn and no galaxies else- 
where in the survey volume V is given by 



II. The Likelihood Function 



+ 



X 



P[xi,X2, ...,xn; ^o{V)\w2{0),n] = exp{Wo} n^dVidV2 ■ ■ ■ (IVn 
W,{x,)W^{x2)■■■W^{xN) 
V2{xi, X2)Wi{x2,) ■ ■ ■ Wi{xn) + permutations 



+ 
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+ 1^2(2^1,3^2) ■ ■ ■ W2{xn-i, Xn) + permutations 



(2) 



where ^o{V) is the "proposition" that there are no galaxies in the sample other than those 
at the specified Xi. Here we have assumed that there are no higher order correlations. The 
general expression including higher order correlations was first written down by White 
(1979); we do not reproduce it here. So the free parameters in the "theory" are the 
density n and the two point correlation function, 102(0). The Wj's are then given simply 



Here the infinitesmal "volume" dVi surrounds the position Xi and can be either 2-D 
(areas) or 3-D (volumes). For a realistic survey, all of these volume integrals are to be 
weighted by the selection function of the survey: this replaces the probability that there 
are galaxies at the Xi and no others with the probability of observing galaxies at the Xi 
and nowhere else. In principle, by suitably recomputing the probability of the statement 
^oiy), complications such as redshift-space distortions could be included. 

We should also note that this expression assumes a particular model for the relation- 
ship between galaxy positions and the underlying "number density field": the galaxies 
are a Poisson sample of the density. That is, the probability of finding a single galaxy 
in an infinitesimal volume 5V around x is just n{x)5V -C 1. We can then use a suitable 
biasing prescription to connect the number density field n{x) to the mass density field. 

Equation ^ is deceptively simple. A given line contains m occurences of the two point 
function W2- The deceptiveness lies in the phrase "+ permutations": there are many, 
many terms included in this phrase. Consider the line with 14^2 's- The number of 
ways of choosing 3A^/4 galaxies and assigning each a factor of Wi{xi) is 



by 




-nV + — J dVidV2W2{xi,X2) 
1-n dV2W2ixi,X2) 



(3) 



(3iV/4)! iN/A)\' 



(4) 



The remaining N/4 galaxies can be arranged into 14^2 's in 



(iV/4)! 



(5) 



2N/8 (]v/8)! 
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ways. So the total number of terms on this one hne is 



^ '8Nf/' (6) 



{N/8)\ {3N/A)\ 



where the approximate equahty uses Stirhng's formula. Even for a small catalogue with 
a thousand galaxies, this line contains 10^^^ terms! Each of these must be calculated 
separately; and there are N/2 lines with comparable numbers of terms (and this is just for 
the Gaussian case!). Clearly, an exact calculation of the likelihood function is out of the 
question. We need to develop approximation schemes that will enable us to circumvent 
an exact calculation. 



III. Weak Correlation Limit 

We are especially interested in the regime where the correlations are weak (w <^ 1). 
In this regime, we might further expect non-Gaussianity to be negligible, at least when 
we start from an initially Gaussian density field as from inflationary theories. 

A. Zero Order Solution 

We want to expand Eq. ^ about W2 = 0. So let us first find the likelihood when 
W2 = 0. In this case the likelihood function is extremely simple; it reduces to the Poisson 
distribution: 

P[xi, X2, . . . , xn; %{V)] = exp{-nV} dVM ■ ■ ■ (IVn (7) 

If we differentiate this with respect to the one free parameter, the density n, we find that 
the likelihood function peaks when 

n = n = N/V. (8) 

The width of the likelihood function gives a measure of how accurately the parameter n 
is known. One way to estimate this width is to expand ln(£) around n = n: 

ln(P) = ln(P[n]) + - nf + . . . (9) 
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In this simple example, 

ln(P) = —nV + A^ln(n) + constant 

= -nV + N\n(n) + -(^)(n-n? + ... (10) 
2 V / 

So we can identify the width of the likelihood function as [n^ /NY^"^ = {N/V'^Y^'^, so that 
Sn/n = A^~^/^, which is of course the correct answer for a Poisson distribution. Note 
that the Poisson distribution considered as a function of n is skewed. So if we choose as 
our estimator (say) the mean rather than the mode, we would find a different answer; 
for a large survey, this is clearly a tiny effect. 

We now pursue this approach including correlations. Specifically, we calculate the 
first derivative of ln(P) and set it to zero in order to determine the free parameters. 
Then we calculate the second derivative to calculate the width of the likelihood function. 



B. First Order Solution for n 

The starting point is Eq. ||: 



a a,b 



\nP = Wo + iVlnn + ln 

a a,b a,b,c a,b,c,d 



n WliXa) + - E E W^iXa, Xt) n W,ix,) 
a ^ a b c 



+ ^EEEEw^2(x„f,)i^2(x„f,) n Wii^e) 

abed e 



where we have dropped irrelevant constants [such as In^]. We need to expand this 
to second order in W2', then when differentiating to find the maxmium, we will get a 
linear equation for W2- Therefore, only terms up to second order in W2 have been kept. 
Another facet of this expansion — which is detailed in Appendix A — is that all the Wis 
in the W2W2 W^i term (last in the square brackets above) can be set to one. Similarly, 
all but one of the Wis in terms of the type 14^2 HW^i can be set to one, etc. We have 
introduced the notation of superscripts on Y^, Y\. These indicate which galaxies should 
not be summed or mulitplied over. Thus, nieans sum over all a = 1, . . . , except 
a = b. For example, 

E^E(i-M(i-U, (12) 

a a 

the generalization to more or fewer superscripts should be plain. 

Now, a word about the factors of 2, which may be a source of confusion: the term 
with one W2 clearly has one factor of two to account for the fact that we are double 
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counting 1^2(^1,^2) and 1^2(^2,3^1)- The term with two W2^s obviously needs two of 
these types of factors. But it also needs another one to account for W2{xi, X2)W2{x3, X4) 
and 1^2(3^3,2:4)1^2(3^1,3^2), hence the 1/2^ factor before the last term. 

It is worthwhile here to introduce the notation of Landy & Szalay. For these purposes 
we divide the survey volume into K cells, each of which is so small that it contains at 
most one galaxy. Then 



/ 



dV 



V_ 
K 



K 

E 

i=l 



(13) 



where the sum over i includes even those cells that don't have galaxies in them. With 
this notation, Wq and Wi can be rewritten as 



W^i(5i) 



1 - ^^W2{xi,Xi) 



(14) 



(15) 



It is worth noting that if one goes over the original derivation by White (1979), the 
summation over i and j in the above two equations should, strictly speaking, range over 
only empty cells. We do not place such restrictions here, essentially working in the 
continuum limit, even though we represent the integrals as discrete sums. 

Now let's find the value of the density at the maximum of the likelihood function. 
We need to differentiate Eq. |ll] with respect to n. 



d\nP 

dn 



+ 



dWo N 
H 

dn n 

n W,{Xa) + ^ E E ^2(Xa, Xh) f[ W,{x,) 
a ^ a h c 



a (1,0 a. 



b,c 



X 



23 

d_ 

dn 



EEEE^2(x.,f,)iy2( 3'c, Xd) 



a,b 



n W, (Xa) + ^ E E W^2(Xa, Xh) U W, (f ,) 
a ^ a \) c 



(16) 



Here we have used the fact that dW2/dn = 0. To go further we need the other partial 
derivatives: 



dWp 
dn 



-V + 



nV^ 



^W2{Xi,Xj] 



dWi{Xa) 

dn 



V_ 



^W2{Xi,Xa) 



(17) 



Note first that setting Eq. to zero will give a solution of the form n = N/V + 0(^2). 
We therefore need to keep only terms linear in W2- After differentiating the numerator. 
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it will be of order W2- We keep only these linear terms and set the denominator to one. 
Thus, we are left with 

dlnP N nV"^ V 

i,j a,i 



nV 



The latter two terms differ only through a scaling and whether the sums range over the 
observed galaxies or the random catalogue; in the absence of clustering they will be equal 
and cancel. Hence, they will differ only by another factor of W2; to this order in W2, they 
therefore vanish. Thus we expect no linear correction to the simple estimate of n = N/V. 

C. First Order Solution for ^2(6*) 

Now, we find the maximum likelihood solution for W2', we will defer most of the 
algebraic details to Appendix A. 

First, we will need the derivatives of the correlation function integrals: 

^ = !^2Fe^ (19) 

where Qfj is one if the distance [in either angular space or real space depending on 
whether or not the survey has redshifts] between cells i and j lies in the bin 9. Using the 
definition of Landy & Szalay, this becomes 



dWp ^ rfV^ _ n^V^ RR 



GM - (20) 



where RR is the count of pairs at separation 9 in the random catalogue, as in LS, and 
Nr is the number of random galaxies put in; this just normalizes things. Similarly, 

dWi{Xa) _ nV ^ s dW2{Xa,Xb) ^ g 

dw2{9) K Y dw2{9) "'^ ^ ' 

Another way of viewing the sums over i, j is to think of them as sums over galaxies in a 
random catalogue with the same geometry (and selection function) as the actual survey 
but with K galaxies instead of A^. 
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We also define the data-data and data-random pair counts as 



a.,] 



b aj 

. are over observed galaxies and 



where again sums over a, b 
random catalogue. 

After quite a bit of algebra (see Appendix A for details), we find that 

dlnP 

' = KK 

-2 



(22) 



are over cells or the 



dw2{e) 



n^V"^ nV 
^^RR - —DR + DD 



{DD)W2{9) + E 



(23) 



where 



E 



nV 
'1{ 



II ^2( 



+ Xm^2(Xa,^6) 



i b 



(24) 



We reiterate that the sums over a, h indices are over galaxies in the catalogue, while those 
over i, j are over cells or equivalently over galaxies in a random catalogue with the same 
geometry. In the latter case the number of cells K can be set to the number of galaxies 
in the random catalogue Nr. 



Equation ^ would lead to a complicated (albeit linear) matrix equation for W2- To 
simplify, we note that E is usually small, and is, in fact, negligible to this order in 1^2, by 
an argument similar to that after Eq. |I8[ To see that this is so, note that ©a b simply 
the number of galaxies within the bin Q surrounding the galaxy at Xa- Equivalently, it is 

times the fraction of galaxies in the bin Q around Xa- If the galaxies were distributed 
randomly, this fraction would simply be (l/K) X^i ©^a- So the difference between the 
two terms in square brackets is due solely to the non-randomness of the survey, and so is 
proportional to w. Thus, each of the terms in square brackets in Eq. are of order w^] 
since they multiply terms of order w^-, these terms are quadratic in W2- Therefore, they 
do not contribute at the level we are interested in. 

Finally we are left with the relatively simple expression: 

91nP n'V . nV 



dw2{e) 



—^RR{e) 



N' 



R 



-DR{e) + DD{e) - DD{e)w2{e). 



R 



The likelihood function therefore peaks when 

[n'^V^/Nl)RR{e) - {nV/NR)DR{e) + DD{e) 



W2{9) 



DD{9) 



(25) 



(26) 
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This expression, while suggestive, is not yet complete. This expression for W2 depends 
on the as yet unknown parameter ttQ. As we have seen in the previous subsection, the 
likelihood function peaks when n = N/V, irrespective of w{6)] hence we can simultane- 
ously maximize the likelihood for both the density and correlation function. Inserting 
this value of n here, we find that 

(iVViV|)igi?(g) - iN/Nn)DR{e) + DDje) 

= dW) • ^ ^ 

This then is the maximum likelihood estimator for W2- It differs only slightly from the 
estimator proposed by Landy & Szalay; their estimator had RR in the denominator 
instead of DD. As a point of notation, we mention that LS refer to their estimator as 
{DD — 2DR + RR) / RR; the factor of 2 results from a normalization choice they have 
made (in the definition of d in their Eq. 46). 

In §IV, we will explore these estimators in greater detail. 



D. Width of Likelihood Function 



We now want to calculate the width of the likelihood function. Specifically, we are 
interested in the matrix 

_i _ dHnP 

= -dKdx; ^^^^ 

where the parameters in the theory, are the binned 102(6) and n. The variances for 
each of the individual quantities are then 1/C~l^ if all other parameters are held fixed, 
while Ca,f3 is the general covariance matrix if all parameters are allowed to vary. 
Let us first calculate by differentiating Eq. |18[ We find 

^n!n = -^-^Yl ^2(Xi, Xj). (29) 

So the variance in n is increased by the presence of correlations. 

The next element is obtained by differentiating Eq. |18| with respect to W2- We find 

= 2nV^^ - ^^DR (30) 



*Had we kept the terms in E in our expression for w, then the estimate for w{0) would depend on w 
in all other angular bins as well, i.e. on the other free parameters in the theory. After dispensing with 
E, we find that the estimate for w{d) still depends on one of the other free parameters in the theory, the 
density n. 
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Note that this vanishes if there are no correlations. Note also that since Eq. [T^ was 
accurate only up to order W2, this expression is accurate only up to order This is 
true for C~^^^ as well: since our expression for d\nP/dw2 is accurate only up to order 
W2, the second derivative of will not contain any information about the W2 dependence 
of the width. We will not be able to distinguish between the width due to a random 
catalogue from that due to a correlated one. Carrying through this differentiation on 
Eq. we find 

C-V),.,(,,) = DDie,)6o,,e2 (31) 

where 56»i,92 is the Kronecker delta. This agrees with Landy & Szalay's calculation, 
neglecting corrections of order W2- Correlations between W2^s at different 6''s would appear 
in the higher order terms. To assess the relative effectiveness of these two estimators, we 
could go back to our expansion and attempt to extract the order w| terms. Alternatively, 
we could perform a Monte Carlo to see which has the lower variance. We choose the latter 
option. 



IV. Estimators of the Two Point Function 

Here we would like to test the effectiveness of various estimators. Before presenting 
our results, we briefly review previous work. Landy & Szalay analyzed random catalogues, 
attempting to measure the variance of their estimator and the more traditional DD/ DR. 
They found that the variance of their estimator was very close to the expected Poisson 
value: 

mRR{e)' ^ ' 

Note that this is simply one over the expected number of galaxy pairs per bin squared, 
l/^pairs other hand, the DD/DR estimator gave a larger variance than this at 

large angles. They attributed this to the fact that at large distances, the number of 
galaxy pairs per bin goes up and the DD/DR variance has an additional term beyond 
Poisson which goes as 1/iVpairs- As A^pairs gets larger, this additional term eventually 
dominates. Bernstein (1993) simulated catalogues with non-zero ra-point correlations. 
He found that the Landy & Szalay estimator had a larger variance in this correlated case 
than what one would expect based on Poisson-counting. As we will see, our work agrees 
with both of these results. 
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We work in a 32^ box (the units are irrelevant). An outline of our recipe is: 

1. Generate a galaxy catalogue with of order 1500 galaxies. 

2. Compute the expected value of W2 from this catalogue using three different estima- 
tors: Landy-Szalay (LS), DD/DR, and our maximum likelihood (ML). 

3. Repeat steps 1 & 2, 200 times. 

4. Calculate the mean and variance of these estimators over the ensemble of realiza- 
tions. 

To generate a galaxy catalogue (step 1), we input our desired W2{0), and Fourier 
transformed to get the power spectrum. We then used this power spectrum to generate a 
density field everywhere on a 32^ grid. We chose the amplitude of W2 to be small enough 
so that 6 was never less than —1 anywhere on the grid. Then, we used the density field 
to produce a Poisson realization with at most ten galaxies per cell. Then we randomized 
the positions within each cell. Several comments about this proceedure: First, we are by 
necessity limited to small W2- Second, we do not trust our results on scales smaller than 
one cell size (x = 1). Third, because of periodicity, W2 obtained in this way is symmetric 
about the half-way point {x = 16), so we restrict our analysis to a; < 16. 

To calculate the expected value of W2 (step 2), we generate thirty random catalogues 
with one thousand particles each.[|] We calculate DD, DR, RR in each of seven bins, the 
lowest at 1 < s < 3 and the highest at 13 < x < 15. From these, we construct the three 
estimators of interests. 

Our results are shown in Figures 1 and 2. Figure 1 shows that all of the estimators 
come very close to the true value of W2- The variance of the LS estimator is indeed 
significantly smaller than DD/DR at large separations when the 1/A^pairs factor begins 
to overtake the Poisson variance. The variance of the ML estimator is similar to LS, but 
as shown in the top panel of Figure 1, appears to be smaller when W2 is non- negligible. 

There is one other feature of our analysis which bears note. Figure 2 shows the 

variance of the LS estimator as compared with the "expected" Poisson variance, Eq. ^ 

The variance of the LS estimator is significantly larger when W2 is non- negligible. We 

believe this is real and that there are two non-negligible additional terms in the LS 

tTliis should be equivalent to one catalogue of 30,000 particles but is computationally faster to 
analyze. We have checked that this is sufficient by calculating the expected variance of W2 for a random 
catalogue a la Landy & Szalay. We agree with the expected variance in that case. 
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Fig. 1: The bottom panel shows the input W2 and the estimated values, using three different 
estimators. The error bars for the Landy & Szalay estimator are plotted along with those for DD/DR. 
At large distance (with many galaxy pairs per bin) the Landy & Szalay estimator has significantly 
smaller variance than does DD/DR. The error bars for our maximum likelihood estimator are similar 
to those of the Landy & Szalay estimator so are not shown explicitly in the bottom panel. The ratio of 
the two is shown in the top panel. For bins with non- negligible W2, the maximum likelihood estimator 
appears to have a smaller variance. 
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Fig. 2: The ratio of the rms variation of the Landy & Szalay estimator to the "expected" Poisson 
rms, y/Ny{N^RR). When W2 is non-negligible (at small x) the rms variation is significantly larger 
than Poisson. (Blue) circles are variance obtained from 200 simulated catalogues; (red) squares are the 
variance expected from analytic results in Appendix B. 
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variance, one proportional to W2/N, and one to Appendix B derives these additional 
terms analytically. Both the numerical and the analytic result agree with Bernstein's 
results. The differences are: we have included edge effects and we have isolated the fact 
that the discrepency is due to these added terms in the variance. It is not due to higher 
order cosmic correlations, as these are excluded by construction in our mock catalogue. 



V. Conclusions and the Future 

The likelihood function would be a wonderful object to compute for a galaxy survey. 
By construction, it would use all of the information from the survey and allow us to 
compare different theories. As such, we feel it is very important to explore the possibility 
of computing this function. Direct computation is impossible, but we have developed 
an approximation scheme which appears to work very well. Before specifically detailing 
the use we have made of this approximation, we want to emphasize that there are many 
ways to branch out from here: 

• Use a theory such as Cold Dark Matter with its one or two free parameters. The 
approximation scheme developed here can then be used to extract the best fit values 
of these parameters. 

• Generalize the approximation to include higher-order correlations. This would have 
the benefit of using the higher-order correlations together with the two-point func- 
tion to constrain theories. Alternately, we could use the ansatz of hierarchical 
clustering and results from gravitational perturbation theory to generate higher- 
order moments from the two-point function. 

• Generalize this work to Fourier space. Theories are most easily compared in Fourier 
space so this is a natural way to go. Just as the ML procedure generates a (diagonal) 
matrix equation for W2, in Fourier space we have an integral equation for P{k). 

• Find a graphical method which simplifies, and helps organize, the expansion we 
have introduced. We have relied on arguments like: certain terms, such as E (Eq. 
P^ ), are implicitly of the order of W2^, even though they do not appear explicitly 
so. It would be nice to make these more precise and systematic. 
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• Go beyond the perturbative approach and try to learn something from the full 
likelihood function. This is not as impossible as we made it sound in §11: without 
actually computing one might still make some very general statements. 

We have made progress in the latter four areas. This will be presented in a future 
paper. 

In this paper we have limited ourselves to one application: finding the place where 
the likelihood function peaks if the theoretical parameters are the binned values of the 
two-point function. Equivalently, we have come up with a new estimator for W2- We 
found that 

• The maximum likelihood (ML) estimator appears to have a slightly smaller variance 
than the Landy & Szalay (LS) estimator and certainly than DD/DR. To the extent 
that the the ML and LS estimators are similar (and they are very similar) this whole 
treatment can be thought of as further motivation for the LS estimator. 

• There are additional terms in the variance of the LS estimator beyond the Poisson 
variance. These terms begin to dominate when correlations are non-negligible and 
the number of pairs of galaxies per bin is large. 
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Appendix A. Derivation of Weak Correlation Limit 



We will take derivatives of Eq. |Tl| with respect to W2', and then expand it out to first 
order in W2- 

d\nP n'^V^RR 



dw2{e) 
+ 

X 



a,b 



1 -1 



a a b c 



nV 
1{ 



d 



E 0' . n (5a) + - E E { (f a , f n (5c) } 



a b 



+ 



1 

23 



a,b a,b,c 



EEEE 



d 



a,b,c,d 



-{W2{xa,Xb)W2{x,,Xd) n W,{x,)} 



(Al) 



a b c d ^MO) 

Since we are keeping only first order terms here, we have dropped the W2 term in the 
denominator. We can go further though. The term in the denominator linear in W2 is 
multiplied by the product of Wis; these can all be set to one. Similarly, the derivative 
operator acting on the last term only affects the W2's] the WiS can again all be set to 
one. This gives: 

d\nP n^V^RR 



dw2{e) 



N'r 
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X 
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nW^l(5a) + ^EE^2(fa,X,) 
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nV 

IT 



E 0' . n (5a) + ^ E E -Q^. { (X. , f .) n W^l (f e) } 
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a a,b a,b,c 
????^«^2(^) 
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{W2i ■^ay -^b )W2ix,,Xd)} 



(A2) 



Now carry out the derivatives: 

9 In P n^V^RR 



dw2{e) 
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EE01) 
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a a,,b a,b,c 



Now expand the denominator: 
1 

So, 

dlnP 



+ ^ {^%W2{Xc, Xd) + eldW2{Xa, Xb) 



1- oEE^2(Xa,Xt,) - ^[l^l(fa) - l] 
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+ 

X 
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Multiplying through, we find 



Xay Xfy 



d\nP 
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^ a b c d 

^EE^2(x.,f,)+E(W^l(^a)-l) 

-^E0l. + ^EE0lJ (A6) 

^ b,i a b 

There are three sets of terms here, those which have W2 explicitly in them; those which 
are independent of W2; and those which depend on W2 only through Wi — 1. Let us treat 
each of these in turn. 

First consider the terms independent of W2 in Eq. \A6[ 



RR- '^^01. l^tel, 

b,i o- b 



n^V^ nV 
-^RR--DR + DD 



(A7) 
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Next consider the terms which depend exphcitly on W2- 



2K 

-^EE^2(^a,^6) 



23 

^DR + DD 

Nr 



a b c 



(A8) 



We claim that the two terms in the quadruple sum on the first line are identical. To see 
this, first switch indices a c; 6 c? in the first term. Then it is: 



ax c.d.a 



c c,d c,d,a c 

E E E E ^id^2{Xa, ^b) = E E E E ^idM^a, Xb) 

c d a. b a c d b 



(A9) 



where we have switched the summations and taken care to guard against summing over 
identical a = c for example. Continuing in this fashion, we get the second term in the 
quadruple sum. Thus, all terms with explicit W2^s in them are: 



EE^2(^a,^6) 
a b 



T ^ (1.6 1 ci.h a.h,c 1 T / 

"'i:Ee?,.4i:i:e«,-i(-=;^i.is + i.i.) 



2K 



e i 



c d - ^'R 

Consider the first term in brackets. We can rewrite the sum over e as 



(AlO) 



= E (1 - 5«e) (1 - Sbe) = E (1 - ^ae - he) 



(All) 



Note that since a is never equal to b, this is exactly true. Now the unrestricted sum over 
e simply gives DR with the sign and coefficient exactly right to cancel the third term in 
brackets. So the terms with explicit W2 dependence are: 



EE^2(:?a,^fe) 
a b 



^E(0l + 0tb) + ^EE0l.-^^^ 



c d 



(A12) 



The same argument holds for the DD terms. The only terms which remain in the sums 
over c, d are those wherein c or d equals a,b,c. Let us do this carefully because there 
might be subtleties here. 



a,b a,h,c a,b c 

EE = EE(i-'^^a-'^'^^) 

c d c d 

= EE[i-(<^- + <^bc)]-EE(^<^a + <^<^^) 

c d c d 



(A13) 
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So the exphcit W2 terms become 



a b 
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(A14) 

We want to focus on one of these terms: the one in which the index d is equal to a or h. 
Thus rewrite this last line as 



EE^2( 



•^a^ -^b ) 



1 °' 

9 EE^2(Xa,ffe)e^^fe 
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(A15) 



where we have made use of the symmetry between a, h to sum up lots of identical terms. 
Now consider the last term. The 0^ j, requires all separations between Xa and Xb to lie 
within the bin Q. Within this bin, by definition, W2 is constant [this is the theory we are 
trying to solve for]. Thus, W2 comes out of the sum and we are left with simply DD . So 
the terms with only explicit W2 dependence are 



EE^2(^a,^b) 
a b 



nV g ^ Q 

"fT E ®i,a ~ E ®a,rf 
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We can now reinsert all this back into Eq. |A6| . 
ainP 
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(A17) 



The terms in the middle two lines nearly cancel, except for the restrictions on the sums. 
The only terms that remain from these two lines are those in which c on the third line 
equals a or h. Thus we are lead directly to Eq. |2^. 



Appendix B. Variance of the Landy &; Szalay Estimator 

To derive the variance of the Landy & Szalay estimator, let us rewrite it in the 
following form: 
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^LS(^) = E - (^QdiQj - O , (Bl) 

where 

and Qi equals one where the cell contains a galaxy, and equals zero otherwise, whereas g[ 
is similarly defined for the random catalogue. The factor a scales the average density in 
random catalogue back to the level in the actual catalogue: in other words, a = N/Nji 
with and Nji being the total number of galaxies in the actual catalogue and in the 
random catalogue respectively. 

It can be shown that the introduction of the random catalogue introduces extra 
variance (i.e. variance of the random catalogue itself), which could be eliminated if one 
uses a large number of random catalogues, or if one allows the number of galaxies in 
random catalogue to increase dramatically. We will compute the variance of W2^{9) in 
this limit, in which case, one can replace every ag[ by q where q = (qi). Hence, Eq. ^1] 
is equivalent to 

^'2^iO)=Y.W^AWJ (B3) 

with 

2-im,n ^m,n\" ) 

and 5i = {qi -q)/q. 

The variance is defined by (['^2^(6')]^) — (^2^(6'))^. This means one needs the following 
quantity: 

{SiSjSkSi)-{6,6,){6k6i) (B5) 
= [w2{xi,Xk) + Sikqr^][w2{xj,xi) + 6jiqi~'^] + {k ^ I) 

which is adopted from Hamilton (1997). This expression holds in the small pixel (contin- 
uum) limit, with the restriction i ^ j and k ^ I (for nonzero 9) and for Gaussian random 
underlying field. We have also ignored terms the contributions of which to the variance 
are of order of smaller than those we have kept. 
Putting everything together, we obtain: 

Variance - ^ + 4 E,,, 9.,(g)9.,(g)^2(x,, 

Variance - ^^^^ ^^^^^^^ + ^^^^^^^^^ (B6) 

^ 2 T.i,j,k,l Qi,j{d)Qk,l{^)w2{Xi, Xk)w2{Xj, Xi) 
Em,n 0m,?i(^)]'^ 
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The first term on the right is the Poisson variance given by Landy & Szalay (1993). 
The second two terms arise because of finite two-point correlations. They have been 
derived before by Bernstein (1993), ignoring edge effects, and by Mo, Jing, & Borner, 
ignoring discreteness effects. Note that both Landy & Szalay and Bernstein obtained 
extra terms for the Poisson variance, which we have ignored because they are of the 
order of 1/A^ smaller than those we have kept. The variance in Eq. can be estimated 
using the standard technique of counting pairs, triplets and quadruplets using a random 
catalogue with the same geometry. For instance one can estimate Z^jj 0i,i(^)/2 by 
[N^lNl]RR{e). 
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